
[GLUTEN-7745][VL] Incorporate SQL Union operator into Velox execution pipeline #7842

Merged: 18 commits merged into apache:main on Dec 3, 2024

Conversation

@zhztheplayer (Member) commented Nov 7, 2024

#7745

Edit: The single-threaded execution issue will be fixed in Velox by facebookincubator/velox@e80bf12.

Introduces a new config, spark.gluten.sql.native.union (default: false), which works for the Velox backend only at the moment.

Native union will not be used when the children of the union operator don't share the same known partition number (this could perhaps be fixed in the future).
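
To illustrate, a minimal sketch of enabling the feature (only the config key comes from this PR; the app name, tables, and query below are hypothetical placeholders):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: turn on Gluten's native union for the Velox backend.
val spark = SparkSession
  .builder()
  .appName("native-union-demo")
  .config("spark.gluten.sql.native.union", "true")
  .getOrCreate()

// Both inputs must share the same known partition number; otherwise
// Gluten keeps the vanilla Spark union.
spark.sql("SELECT id FROM t1 UNION ALL SELECT id FROM t2").show()
```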

There is a Spark UT that fails when native union is enabled; it is related to Velox LocalPartition output ordering and requires a fix in the future:

2024-11-29T12:48:14.0003314Z - SPARK-14393: values generated by non-deterministic functions shouldn't change after coalesce or union *** FAILED ***
2024-11-29T12:48:14.0004432Z   Array([0,0], [2,1], [1,8589934592], [3,8589934593]) did not equal Array([0,0], [1,8589934592], [2,0], [3,8589934592]) Values changed after union when codegenFallback=true and wholeStage=false. (DataFrameFunctionsSuite.scala:3780)

So far the failure is benign, since SQL generally does not guarantee the output ordering of a union operator.
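
To make the ordering sensitivity concrete, a hypothetical repro sketch (this is not the actual UT, which is the SPARK-14393 test in DataFrameFunctionsSuite):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.monotonically_increasing_id

// Sketch only: ids come from a non-deterministic function, so if the
// union's output order changes (e.g., via Velox LocalPartition
// interleaving its inputs), the collected rows can differ from what
// vanilla Spark produces, even though the multiset of rows is the same.
val spark = SparkSession.builder().appName("union-order-demo").getOrCreate()
val df = spark.range(2).select(monotonically_increasing_id().as("id"))
val unioned = df.union(df)
unioned.collect().foreach(println) // order is not guaranteed across engines
```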

This was initially a PoC and not workable because of a blocker in Velox's single-threaded execution design (since addressed; see the Edit above):

I20241107 15:06:23.310680 1849235 VeloxPlanConverter.cc:128] Plan Node: 
-- Aggregation[7][PARTIAL n7_0 := count_partial("n0_0")] -> n7_0:BIGINT
  -- LocalPartition[6][GATHER] -> n0_0:BIGINT
    -- Project[4][expressions: (n0_0:BIGINT, "n1_1")] -> n0_0:BIGINT
      -- Project[1][expressions: (n1_1:BIGINT, "n0_0")] -> n1_1:BIGINT
        -- TableScan[0][table: hive_table] -> n0_0:BIGINT
    -- Project[5][expressions: (n0_0:BIGINT, "n3_1")] -> n0_0:BIGINT
      -- Project[3][expressions: (n3_1:BIGINT, "n2_0")] -> n3_1:BIGINT
        -- TableScan[2][table: hive_table] -> n2_0:BIGINT
24/11/07 15:06:23 ERROR TaskResources: Task 11 failed by error: 
org.apache.gluten.exception.GlutenException: Task doesn't support single thread execution: -- Aggregation[7]

	at org.apache.gluten.vectorized.PlanEvaluatorJniWrapper.nativeCreateKernelWithIterator(Native Method)
	at org.apache.gluten.vectorized.NativePlanEvaluator.createKernelWithBatchIterator(NativePlanEvaluator.java:66)
	at org.apache.gluten.backendsapi.velox.VeloxIteratorApi.genFirstStageIterator(VeloxIteratorApi.scala:214)
	at org.apache.gluten.execution.GlutenWholeStageColumnarRDD.$anonfun$compute$1(GlutenWholeStageColumnarRDD.scala:88)
	at org.apache.gluten.utils.Arm$.withResource(Arm.scala:25)
	at org.apache.gluten.metrics.GlutenTimeMetric$.millis(GlutenTimeMetric.scala:37)
	at org.apache.gluten.execution.GlutenWholeStageColumnarRDD.compute(GlutenWholeStageColumnarRDD.scala:77)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:104)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:54)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:166)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:620)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally(SparkErrorUtils.scala:64)
	at org.apache.spark.util.SparkErrorUtils.tryWithSafeFinally$(SparkErrorUtils.scala:61)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:94)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:623)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

Related Velox code:

https://github.com/facebookincubator/velox/blob/12b52e70ec85ae0cdb4aa990797cddda9be5be27/velox/exec/Driver.h#L658-L660

@github-actions github-actions bot added CORE works for Gluten Core VELOX labels Nov 7, 2024
@zhztheplayer zhztheplayer changed the title PoC: [VL] Incorporate SQL Union operator into Velox execution pipeline PoC: [GLUTEN-7745][VL] Incorporate SQL Union operator into Velox execution pipeline Nov 7, 2024
github-actions bot commented Nov 7, 2024

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

github-actions bot commented repeatedly (Nov 7 – Dec 2, 2024): Run Gluten Clickhouse CI on x86

@zhztheplayer zhztheplayer marked this pull request as ready for review December 2, 2024 00:22
@zhztheplayer zhztheplayer changed the title PoC: [GLUTEN-7745][VL] Incorporate SQL Union operator into Velox execution pipeline [GLUTEN-7745][VL] Incorporate SQL Union operator into Velox execution pipeline Dec 2, 2024

@PHILO-HE (Contributor) left a comment

Looks good. Some minor comments.

expected = df.collect()
}
// By default we will fallabck complex type scan but here we should allow
// By default, we will fallabck complex type scan but here we should allow
PHILO-HE (Contributor):

Please also fix the typo: fallabck.

zhztheplayer (Member, author):

fixed

/**
 * run a query with native engine as well as vanilla spark then compare the result set for
 * correctness check
 */
protected def compareResultsAgainstVanillaSpark(
    sqlStr: String,
protected def compareDfResultsAgainstVanillaSpark(
PHILO-HE (Contributor):

Does Df mean DataFrame? The naming seems unfriendly to readers.

zhztheplayer (Member, author):

Let's just use DF? It's a common abbreviation of DataFrame in Spark. I think we should avoid making names too long.

PHILO-HE (Contributor):

@zhztheplayer, I see. It's OK to use your proposed name for now. Thanks!

}
}

private def sameNumPartitions(plans: Seq[SparkPlan]): Boolean = {
PHILO-HE (Contributor):

Can we move this into the validate function below?

zhztheplayer (Member, author):

Its purpose is to make the code self-explanatory. Say, a reader will know we are comparing the partition numbers of the children because they see sameNumPartitions as the method name.
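
For illustration only, a minimal sketch of what such a helper could look like (the actual body in the PR may differ):

```scala
import org.apache.spark.sql.execution.SparkPlan

// Sketch: native union is only considered when every child reports the
// same number of output partitions. The real check in the PR may differ.
private def sameNumPartitions(plans: Seq[SparkPlan]): Boolean = {
  val numPartitions = plans.map(_.outputPartitioning.numPartitions)
  numPartitions.distinct.size == 1
}
```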

bool validateInputTypes(const ::substrait::extensions::AdvancedExtension& extension, std::vector<TypePtr>& types);
/// Used to get types from advanced extension and validate them, then convert to a Velox type that has arbitrary
/// levels of nesting.
bool validateInputVeloxType(const ::substrait::extensions::AdvancedExtension& extension, TypePtr& out);
PHILO-HE (Contributor):

validateInputVeloxType makes me feel that the input is a Velox type. Can we use some other name, e.g., parseToVeloxType?

zhztheplayer (Member, author):

Changing to parseVeloxType. Thanks.


Comment on lines +271 to +273
// FIXME: The whole metrics system in gluten-substrait is magic. Passing metrics trees through JNI with a trivial
// array is possible but requires for a solid design. Apparently we haven't had it. All the code requires complete
// rework.
zhztheplayer (Member, author) commented Dec 2, 2024:

I found it's really painful to add new metrics in the Gluten Velox backend. A set of metrics from the Velox query plan is converted to a tree or an array again and again while being passed to the Spark/Java side. A lot of magical IDs are used in the tree/array traversal algorithms, with a lot of special handling of corner cases, e.g., join, filter-project, and the union added in this PR.

The whole metrics system should be reworked (if someone could lead this work); otherwise it's unmaintainable.
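
As a thought experiment on the "trivial array" idea mentioned in the FIXME, here is a hypothetical sketch (these types and functions do not exist in Gluten; they only show that a pre-order encoding with child counts can round-trip a metrics tree across a flat boundary such as JNI):

```scala
// Hypothetical metrics node; Gluten's real metrics classes differ.
final case class MetricsNode(name: String, value: Long, children: Seq[MetricsNode])

// Pre-order flatten: each node is emitted with its child count, which is
// enough to rebuild the tree on the other side of the boundary.
def flatten(root: MetricsNode): Seq[(String, Long, Int)] =
  (root.name, root.value, root.children.size) +: root.children.flatMap(flatten)

// Inverse of flatten: consume the flat sequence recursively, using the
// stored child counts to recover the tree shape without magic IDs.
def rebuild(flat: Seq[(String, Long, Int)]): MetricsNode = {
  def go(i: Int): (MetricsNode, Int) = {
    val (name, value, numChildren) = flat(i)
    var next = i + 1
    val children = (0 until numChildren).map { _ =>
      val (child, after) = go(next)
      next = after
      child
    }
    (MetricsNode(name, value, children), next)
  }
  go(0)._1
}
```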


@zhztheplayer zhztheplayer merged commit 6dd91ba into apache:main Dec 3, 2024
49 checks passed
Labels: CLICKHOUSE, CORE (works for Gluten Core), VELOX